The Circle of Meaning: From Translation to Paraphrasing and Back
The preservation of meaning between inputs and outputs is perhaps
the most ambitious and, often, the most elusive goal of systems
that attempt to process natural language. Nowhere is this goal of
more obvious importance than for the tasks of machine translation
and paraphrase generation. Preserving meaning between the input and
the output is paramount for both, the monolingual vs bilingual distinction
notwithstanding. In this thesis, I present a novel, symbiotic relationship
between these two tasks that I term the "circle of meaning".
Today's statistical machine translation (SMT) systems require high
quality human translations for parameter tuning, in addition to
large bi-texts for learning the translation units. This parameter
tuning usually involves generating translations at different points
in the parameter space and obtaining feedback against human-authored
reference translations as to how good the translations are. This feedback
then dictates what point in the parameter space should be explored
next. To measure this feedback, it is generally considered wise to have
multiple (usually 4) reference translations to avoid unfair penalization of translation
hypotheses, which could easily happen given the large number of ways in which
a sentence can be translated from one language to another. However, this reliance on multiple reference translations
creates a problem since they are labor intensive and expensive to obtain.
Therefore, most current MT datasets only contain a single reference.
This leads to the problem of reference sparsity, the primary open problem
that I address in this dissertation and one that has a serious effect on the
SMT parameter tuning process.
Bannard and Callison-Burch (2005) were the first to provide a practical
connection between phrase-based statistical machine translation and paraphrase
generation. However, their technique is restricted to generating phrasal
paraphrases. I build upon their approach and augment a phrasal paraphrase
extractor into a sentential paraphraser with extremely broad coverage.
The novelty in this augmentation lies in the further strengthening of
the connection between statistical machine translation and paraphrase
generation; whereas Bannard and Callison-Burch only relied on SMT machinery
to extract phrasal paraphrase rules and stopped there, I take it a few
steps further and build a full English-to-English SMT system. This system
can, as expected, "translate" any English input sentence into a new English
sentence with the same degree of meaning preservation that exists in a bilingual
SMT system. In fact, being a state-of-the-art SMT system, it is able to generate
n-best "translations" for any given input sentence. This sentential
paraphraser, built almost entirely from existing SMT machinery, represents
the first 180 degrees of the circle of meaning.
To complete the circle, I describe a novel connection in the other direction.
I claim that the sentential paraphraser, once built in this fashion, can
provide a solution to the reference sparsity problem and, hence, be used
to improve the performance of a bilingual SMT system. I discuss two different
instantiations of the sentential paraphraser and show several results that
provide empirical validation for this connection.
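As a rough illustration of the phrasal starting point attributed above to Bannard and Callison-Burch (2005), the sketch below scores English paraphrases by pivoting through shared foreign phrases, summing p(f | e1) * p(e2 | f) over every foreign phrase f. The function name, the dictionary layout, and the toy probabilities are assumptions made for this example only; it is not the thesis's sentential paraphraser, which builds a full English-to-English SMT system on top of such rules.

```python
from collections import defaultdict


def pivot_paraphrases(e2f, f2e):
    """Score phrasal paraphrases by pivoting through foreign phrases.

    e2f maps an English phrase e1 to {foreign phrase f: p(f | e1)}.
    f2e maps a foreign phrase f to {English phrase e2: p(e2 | f)}.
    Both tables are hypothetical inputs, e.g. read off a phrase table.
    """
    scores = defaultdict(lambda: defaultdict(float))
    for e1, foreign in e2f.items():
        for f, p_f_given_e1 in foreign.items():
            for e2, p_e2_given_f in f2e.get(f, {}).items():
                if e2 != e1:  # a phrase is not counted as its own paraphrase
                    scores[e1][e2] += p_f_given_e1 * p_e2_given_f
    return {e1: dict(cands) for e1, cands in scores.items()}


# Toy probabilities, invented purely for illustration.
e2f = {"under control": {"unter kontrolle": 0.8, "im griff": 0.2}}
f2e = {
    "unter kontrolle": {"under control": 0.7, "in check": 0.3},
    "im griff": {"in check": 0.5, "under control": 0.5},
}
table = pivot_paraphrases(e2f, f2e)
```

With these toy numbers, "in check" is reached through both pivot phrases, so its score accumulates the contributions 0.8 * 0.3 and 0.2 * 0.5.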
The Hiero Machine Translation System: Extensions, Evaluation, and Analysis
Hierarchical organization is a well-known property of language, and yet the notion of hierarchical structure has been largely absent from the best-performing machine translation systems in recent community-wide evaluations. In this paper, we discuss a new hierarchical phrase-based statistical machine translation system (Chiang, 2005), presenting recent extensions to the original proposal, new evaluation results in a community-wide evaluation, and a novel technique for fine-grained comparative analysis of MT systems.
Measuring Variability in Sentence Ordering for News Summarization
The issue of sentence ordering is an important one for natural language tasks such as multi-document summarization, yet there has not been a quantitative exploration of the range of acceptable sentence orderings for short texts. We present results of a sentence reordering experiment with three experimental conditions. Our findings indicate a very high degree of variability in the orderings that the eighteen subjects produce. In addition, the variability of reorderings is significantly greater when the initial ordering seen by subjects is different from the original summary. We conclude that evaluation of sentence ordering should use multiple reference orderings. Our evaluation presents several metrics that might prove useful in assessing against multiple references. We conclude with a deeper set of questions: (a) what sorts of independent assessments of quality of the different reference orderings could be made and (b) whether a large enough test set would obviate the need for such independent means of quality assessment.